Abstract - Data mining is the discovery of unknown
relationships, associations, groupings, and classifiers
from data. Association rule mining (ARM) is a
knowledge discovery technique used in various data
mining applications. Discovering scalable rules from a
multidimensional database with reduced support remains
an open area of research. Pruning is a technique for
simplifying and hence generalising a decision tree.
Error-based pruning replaces sub-trees with leaves whose
decision class is the majority class. In this paper we
propose an algorithm, DataAprori, that generates scaled
rules using the alarm technique. Network problems
manifest themselves as alarm sequences. Since network
problems repeat more or less frequently, processing
alarm sequences from the alarm history can be a good
basis for creating correlation rules that will be used
in the future, when the same problem appears again.
DataAprori applies the Apriori algorithm to fault
management, introducing logical inventory data into
typical alarm processing through sequence detection.
Experiments on real-world datasets show that the
proposed approach improves performance over the
existing approach in detecting high-level correlations
(alarm sequences) in a telecommunication network.

Keywords: Data mining, ABCDE architecture, pruning,
Apriori technique.

I. Introduction: 

AIREP learns the clauses in the order in which they will
be used by a PROLOG interpreter. Before subsequent
rules are learned, each clause is completed (learned and
pruned) and all covered examples are removed.
Therefore, the AIREP approach eliminates the problem
of incompatibility between the separate-and-conquer
learning strategy and the reduced-error pruning
strategy.

Typically, a network problem is represented by a number
of alarms coming from one or more network elements. If
the alarms are coming from more than one network
element, it is reasonable to expect that the network
elements are interconnected. If we have a logical
inventory database at our disposal (i.e., a database
where information about network element
interconnections is stored), we can include it in the
discovery environment by considering only the clusters
containing alarms from interconnected network
elements.
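
The cluster filter described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the alarm record layout (a dict with an `element` key), the `links` set standing in for the logical inventory database, and the function names are all assumptions made for the example.

```python
def is_interconnected(cluster, links):
    """Return True if the network elements raising the alarms in `cluster`
    form a single connected group according to `links`.

    `links` is a hypothetical stand-in for the logical inventory database:
    a set of frozenset({element_a, element_b}) pairs.
    """
    elements = {alarm["element"] for alarm in cluster}
    if len(elements) <= 1:
        return True
    # Walk the interconnection graph, restricted to elements in this cluster.
    start = next(iter(elements))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for other in elements - seen:
            if frozenset((node, other)) in links:
                seen.add(other)
                stack.append(other)
    return seen == elements


def keep_interconnected(clusters, links):
    """Keep only the clusters whose alarms come from interconnected elements."""
    return [c for c in clusters if is_interconnected(c, links)]
```

A cluster with alarms from a single element is kept unconditionally, since the interconnection test only matters when several elements are involved.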

Since a logical inventory database is not always
available, it can instead be "generated" from the alarm
historical data. In that case, we first analyze alarms
by their location only. After that analysis we have
information about the most frequent points of
interconnection. This data can be stored in a logical
inventory database (using a predefined threshold) and
used in the cluster splitting process in the future.
This concept is described in [7].
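
One possible reading of this idea is sketched below: count how often two locations raise alarms in the same cluster and keep the pairs that reach a threshold. The record layout, the function name `infer_links` and the co-occurrence criterion are assumptions for illustration; the actual procedure is the one described in [7].

```python
from collections import Counter
from itertools import combinations


def infer_links(clusters, threshold):
    """Infer likely interconnections from alarm history.

    `clusters` is a list of alarm clusters, each a list of dicts with a
    hypothetical `element` key naming the alarm location. A pair of
    locations is reported as interconnected when it co-occurs in at least
    `threshold` clusters.
    """
    counts = Counter()
    for cluster in clusters:
        locations = sorted({alarm["element"] for alarm in cluster})
        for pair in combinations(locations, 2):
            counts[frozenset(pair)] += 1
    return {pair for pair, n in counts.items() if n >= threshold}
```

The resulting set of pairs can play the role of the logical inventory when clusters are split later on.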

This research extends our previous work, in which we
proposed Aprori-UB, which uses the multidimensional
access method UB-tree to generate efficient association
rules with high support and confidence [19][20]. The
Aprori-UB approach reduces not only the number of
itemsets generated but also the overall execution time
of the algorithm. In this paper we use the ABCDE
architecture for high-level correlation discovery as
well as typical patterns that can be used for low-level
correlations and filtrations [48][49].

The paper is organized as follows. Section 2 gives an
overview of previous work in the same field. Section 3
explains the concepts used in this paper. Section 4
presents the proposed work. Section 5 gives the
experimentation details. Sections 6 and 7 discuss the
conclusion and future scope.

II. Related Work 



We define $C_k$ as a candidate itemset of size $k$ and
$Z_k$ as a frequent itemset of size $k$. The AIREP
algorithm is:

1) Find the frequent set $L_{k-1}$.

2) Join step: $C_k$ is generated by joining $L_{k-1}$
with itself (Cartesian product $L_{k-1} \times L_{k-1}$).

3) Prune step: use Incremental Reduced Error Pruning
to generate a scalable single rule.

4) The frequent set $L_k$ has been achieved.

The AIREP (Aprori Incremental Reduced Error Pruning)
pseudo code:

AIREP (T, ε)
    Z_1 <- {large multidimensional itemsets that appear
            in more than ε transactions}
    k <- 2
    while Z_{k-1} ≠ ∅
        C_k <- Generate(Z_{k-1})   // join and prune step
                                   // using IREP

        procedure I-REP (Examples, SplitRatio)
            Theory = ∅
            while Positive(Examples) ≠ ∅
                Clause = ∅
                SplitExamples(SplitRatio, Examples,
                              GrowingSet, PruningSet)
                Cover = GrowingSet
                while Negative(Cover) ≠ ∅
                    Clause = Clause ∪ FindLiteral(Clause, Cover)
                    Cover = Cover(Clause, Cover)
                loop
                    NewClause = BestSimplification(Clause,
                                                   PruningSet)
                    if Accuracy(NewClause, PruningSet) <
                       Accuracy(Clause, PruningSet)
                        exit loop
                    Clause = NewClause
                if Accuracy(Clause, PruningSet) <=
                   Accuracy(fail, PruningSet)
                    exit while
                Theory = Theory ∪ Clause
                Examples = Examples - Cover
            return (Theory)
        // end of IREP

        // frequent set generation
        for transaction t ∈ T
            C_t <- Subset(C_k, t)
            for candidates c ∈ C_t
                count[c] <- count[c] + 1
        Z_k <- {c ∈ C_k | count[c] >= ε}
        k <- k + 1
    return the union of all Z_k

Figure 1 : Pseudocode of AIREP algorithm 
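
The level-wise loop of Figure 1 can be illustrated with a short Python sketch. It is a simplified stand-in rather than AIREP itself: the prune step here is the classical Apriori subset check, not IREP, and the function name `apriori`, the use of `frozenset` itemsets and the `>=` threshold are assumptions of this example.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least
    `min_count` transactions (classical Apriori, no IREP pruning)."""
    transactions = [frozenset(t) for t in transactions]

    # Z_1: frequent single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s for s, n in counts.items() if n >= min_count}
    result = {s: counts[s] for s in frequent}

    k = 2
    while frequent:
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # Frequent set generation: count candidates contained in each transaction.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {c for c, n in counts.items() if n >= min_count}
        result.update({c: counts[c] for c in frequent})
        k += 1
    return result
```

The function returns the union of all frequent sets found at every level, which is what the final step of the loop is meant to deliver.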

The basic idea of Incremental Reduced Error Pruning 
(IREP) is that instead of first growing a complete concept 
description and pruning it thereafter, each individual 
clause will be pruned right after it has been generated. 
This ensures that the algorithm can remove the training 
examples that are covered by the pruned clause before 
subsequent clauses are learned thereby preventing 
these examples from influencing the learning of 
subsequent clauses. 

Figure 1 shows pseudo-code for this algorithm. As usual,
the current set of training examples is split into a growing
set (usually 2/3) and a pruning set (usually 1/3). However,
not an entire theory, but only one clause is learned from 
the growing set. Then, literals are deleted from this 
clause in a greedy fashion until any further deletion 
would decrease the accuracy of this clause on the 
pruning set. Single pruning steps can be performed by 
submitting a one-clause theory to the same 
BestSimplification subroutine used in REP or, as in our 
implementation, one can use a more complex pruning 
operator that considers every literal in a clause for 
pruning. The best rule found by repeatedly pruning the 
original clause is added to the concept description and 
all covered positive and negative examples are removed 
from the training growing and pruning set. The remaining 
training instances are then redistributed into a new 
growing and a new pruning set to ensure that each of the 
two sets contains the predefined percentage of the 
remaining examples. From these sets the next clause is 
learned. When the predictive accuracy of the pruned
clause is below the predictive accuracy of the empty
clause (i.e., the clause with the body fail), the clause is
not added to the concept description and I-REP returns
the learned clauses. Thus, the accuracy of the pruned
clauses on the pruning set also serves as a stopping
criterion: post-pruning methods are used as pre-pruning
heuristics.
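
The greedy deletion of literals against the pruning set can be sketched as below. This is an illustrative reading of the pruning operator that considers every literal in a clause, under several assumptions: a clause is a list of `(attribute, value)` equality tests, an example is a `(features, label)` pair with a boolean label, and the names `covers`, `accuracy` and `prune_clause` are invented for the sketch.

```python
def covers(clause, features):
    """A clause (conjunction of attribute == value tests) covers an example
    when every test holds. The empty conjunction covers everything."""
    return all(features.get(attr) == value for attr, value in clause)


def accuracy(clause, pruning_set):
    """Fraction of pruning examples classified correctly: covered positives
    plus uncovered negatives, divided by the size of the pruning set."""
    correct = sum(1 for features, label in pruning_set
                  if covers(clause, features) == label)
    return correct / len(pruning_set)


def prune_clause(clause, pruning_set):
    """Delete literals greedily until any further deletion would decrease
    the accuracy of the clause on the pruning set."""
    clause = list(clause)
    while clause:
        # Consider deleting each literal in turn and keep the best result.
        candidates = [clause[:i] + clause[i + 1:] for i in range(len(clause))]
        best = max(candidates, key=lambda c: accuracy(c, pruning_set))
        if accuracy(best, pruning_set) < accuracy(clause, pruning_set):
            break
        clause = best
    return clause
```

As in Figure 1, a deletion that leaves the accuracy unchanged is still accepted, so the sketch prefers the shorter clause on ties.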

As shown in Figure 2, the original data set of labelled
instances is partitioned into subsets. To build a rule,
IREP uses the following strategy. First, the uncovered
examples are randomly partitioned into two subsets, a
growing set and a pruning set. Next, a rule is grown. It
begins with an empty conjunction of conditions and
considers adding to this any condition of the form
$Z_n = u$, $Z_c \le \theta$ or $Z_c \ge \theta$, where
$Z_n$ is a nominal attribute and $u$ is a legal value for
$Z_n$, or $Z_c$ is a continuous variable and $\theta$ is
some value for $Z_c$ that occurs in the training data.
After growing a rule, the rule is immediately pruned.

To prune a rule, our implementation considers deleting any
final sequence of conditions from the rule and chooses
the deletion that maximizes the function



$$u(\mathit{Rule}, \mathit{PrunePos}, \mathit{PruneNeg}) = \frac{p + (N - n)}{X + N}$$

where $X$ (respectively $N$) is the total number of
examples in PrunePos (PruneNeg) and $p$ (respectively
$n$) is the number of examples in PrunePos (PruneNeg)
covered by Rule. This process is repeated until no
deletion improves the value of $u$.
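
The pruning step can be illustrated with the sketch below, which reads the measure as (p + (N - n)) / (X + N) and tries every deletion of a final sequence of conditions. The rule representation (a list of `(attribute, operator, value)` triples) and the names `matches`, `rule_value` and `prune_rule` are assumptions made for this example only.

```python
import operator

# The three condition forms: Z_n = u, Z_c <= theta, Z_c >= theta.
OPS = {"==": operator.eq, "<=": operator.le, ">=": operator.ge}


def matches(rule, example):
    """True when the example satisfies every condition of the rule."""
    return all(OPS[op](example[attr], value) for attr, op, value in rule)


def rule_value(rule, prune_pos, prune_neg):
    """u = (p + (N - n)) / (X + N) on the pruning data."""
    p = sum(1 for ex in prune_pos if matches(rule, ex))
    n = sum(1 for ex in prune_neg if matches(rule, ex))
    return (p + (len(prune_neg) - n)) / (len(prune_pos) + len(prune_neg))


def prune_rule(rule, prune_pos, prune_neg):
    """Repeatedly delete the final sequence of conditions that maximizes u,
    stopping when no deletion improves it."""
    rule = list(rule)
    while rule:
        current = rule_value(rule, prune_pos, prune_neg)
        # Deleting a final sequence leaves a proper prefix of the rule.
        prefixes = [rule[:k] for k in range(len(rule))]
        best = max(prefixes, key=lambda r: rule_value(r, prune_pos, prune_neg))
        if rule_value(best, prune_pos, prune_neg) <= current:
            break
        rule = best
    return rule
```

Unlike the literal-by-literal operator, this variant only ever shortens the rule from its end, so the order in which conditions were added during growing matters.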

Figure 2. Partitioning of original data set of labelled instances

III. Concept Used 

ALARM BASIC CORRELATIONS DISCOVERY 
ENVIRONMENT ARCHITECTURE (ABCDE) 

A. ABCDE architecture overview 
The correlation and filtration rules database contains
data about correlations and filtrations to be performed
in real time by the alarm processing engine. Rules in
this database are proposed by the correlation discovery
and analysis module. This module can be used to discover
new potential rules by running a data mining algorithm
on historical alarm data. It can also be used to analyze
and evaluate potential rule candidates by executing
rules on a sample of historical alarm data. The
filtration part of the correlation discovery and
analysis module discovers and evaluates potential filter
patterns. The alarm data warehouse is a database
containing all raw alarm history data as well as
correlated alarm history data for a certain time period
predefined by the operator (e.g. 2 years) [1]. The alarm
data warehouse is the starting point for discovery and
analysis of typical correlations from alarm historical
data, in order to include them in the correlation and
filtration rules database [2].

Correlations are used to determine the root cause of a
fault and to filter out redundant alarms (Jacques H.
Bellec 2006). A lot of effort has gone into researching
alarm correlation, with the result that all alarm
systems support advanced filtering mechanisms; Wallin
et al. (2009) argue that the problem lies in defining
the rules used to filter the alarms. By filtering out
all redundant alarms, the network operators would only
have to handle relevant alarms, which would make the
network management center more efficient (Wallin et al.
2009). In a survey from 2009, one representative of a
leading telecom operator estimated the use of alarm
correlation at 1-2% of all alarms, and the overall
attitude in the survey was that the technique is
expensive and complex (Wallin & Leijon 2009) [82].



[Figure 3 diagram: Preprocessing (DBMS); Alarm analysis
(data mining, clustering); Alarm correlation
(aggregation, association rules)]

Figure 3. Correlation rule generation process.

Incoming network alarms are generated by the
telecommunication network. Alarms are consumed and
processed by the alarm processing engine, which performs
alarm filtration as well as low- and high-level
correlation.

Processed alarms are presented to the network operator
through the alarm surveillance GUI. The alarm processing
engine uses correlation and filtration rules stored in
the rules database, while incoming alarms are stored in
the alarm data warehouse.

A logical inventory database containing data about
network interconnections can be used for more efficient
alarm correlation. Logical inventory data can also be
used to enrich incoming alarm data by tying relevant
inventory information to it (for instance, a "friendly"
alarm location name). The alarm processing engine is not
the focus of this paper, since a number of commercial
tools are able to perform alarm processing functions.

The description of similarity between alarms given by
Julisch is based on defined taxonomies: the closer their
attributes are within a certain taxonomy, the more
similar two alarms will be.

Logical inventory data should be obtained from the
network operator. However, if it is not obtainable,
there is a proposed technique for extracting logical
inventory data from the alarm history. It was described
in [7] and is not the primary focus of this paper;
however, it is denoted in figure 2 by the Logical
inventory block.

Once clusters are generated, the Apriori algorithm is
run. The final result is a number of alarm sequences
that occurred frequently in the past. Those sequences
are candidate high-level correlation rules for future
alarm processing. A candidate can be accepted on the
basis of rule frequency, but also on the basis of a
network expert's opinion.



3.1 ALARM database 

To probe the database for statistics effectively, the
above-mentioned limitations and a simple rule were used
when filtering and rebuilding the database. AlarmType is
a naming convention and a variable described in the
alarm standard document X.733 (a standard in
telecommunications). The alarm type variable is used in
the uam for the alarm name and for mapping to the
original alarm specifications. If the alarm type could
not be found, the alarm was deleted from the database
[76,75].

After several years of research on IDS, the variety of
results obtained has led the scientific community to
conclude that further research is needed to fine-tune
these systems. Large organisations and companies are
already setting up different models of IDS from
different vendors. Nevertheless, these systems produce
an unmanageable number of alarms. Inspecting thousands
of alarms per day and sensor [1] is not feasible,
especially if 99% of them are false positives [2]. For
this reason, research on intrusion detection systems
during the last four years has focused on how to handle
alarms. The main objectives of this work are to reduce
the number of false alarms, to study the causes of these
false positives, to create a higher-level view or
scenario of the attacks, and finally to provide a
coherent response to attacks by understanding the
relationship between different alarms.

For instance, the manager can decide to launch a chips
discount for every customer buying 6 beers. This special
offer seems very logical, based on our daily experience.
However, there are a number of such association rules
that cannot be perceived by casual observation. Hence,
the manager is forced to analyze the supermarket's
transaction data (i.e., the customer receipt archive or
database) to examine customer behavior while purchasing
products. The result of such analysis is a set of
typical association rules describing how often items are
purchased together. For instance, the rule "Beer → Chips
(80%)" states that four out of five customers buying
beer also buy chips [3]. That result can be useful for
business decisions related to marketing, pricing and
product promotion.
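
The 80% figure is the confidence of the rule, which can be checked on a toy receipt archive. The baskets below are invented purely for illustration, and the helper names `support` and `confidence` are assumptions of this sketch.

```python
def support(itemset, baskets):
    """Fraction of baskets that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for b in baskets if itemset <= set(b)) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Fraction of baskets containing `antecedent` that also contain
    `consequent`: support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


# Invented data: five customers buy beer, four of them also buy chips.
baskets = [{"beer", "chips"}] * 4 + [{"beer"}] + [{"milk"}] * 5
beer_to_chips = confidence({"beer"}, {"chips"}, baskets)
```

Here `beer_to_chips` evaluates to four fifths, matching the reading of the rule as "four out of five customers buying beer also buy chips".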

We treat alarms as products purchased in a supermarket,
and alarm clusters as the baskets of a specific
customer. Hence we have decided to use the Apriori
algorithm to find and recognize specific alarm
sequences, which are potential correlation rules for the
future [2]. The Apriori algorithm itself is described in
a number of papers such as [3]. The final result of
high-level correlation is a correlation rules database.
Rules are structured in an IF-THEN manner: the alarm
processing engine receives the incoming alarm stream and
matches incoming patterns against the existing patterns
in the correlation rules database. When a pattern is
matched, a new alarm is generated containing information
about the real network root-cause problem.
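
A minimal sketch of such IF-THEN matching is given below. The rule format (an ordered pattern of alarm types paired with a root-cause label), the alarm names and the function names are all hypothetical; a production alarm processing engine would also handle time windows and alarm attributes.

```python
def is_subsequence(pattern, stream):
    """True if the alarm types in `pattern` occur in `stream` in the same
    order, possibly with other alarms in between."""
    remaining = iter(stream)
    # `alarm in remaining` advances the iterator until it finds the alarm.
    return all(alarm in remaining for alarm in pattern)


def correlate(window, rules):
    """IF a rule's pattern is matched in the incoming window THEN emit the
    root-cause alarm attached to that rule."""
    return [root_cause for pattern, root_cause in rules
            if is_subsequence(pattern, window)]


# Hypothetical rules database: (pattern, root cause).
RULES = [
    (("LINK_DOWN", "BTS_UNREACHABLE"), "TRANSMISSION_FAILURE"),
    (("POWER_LOSS", "BATTERY_LOW"), "SITE_POWER_FAILURE"),
]
```

For example, `correlate(["LINK_DOWN", "HIGH_TEMP", "BTS_UNREACHABLE"], RULES)` matches only the first rule, because the unrelated `HIGH_TEMP` alarm is skipped and the second pattern never appears.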

IV. Proposed work 

The Proposed Algorithm: Pseudocode

• Join step: C_k is generated by joining L_{k-1} with
itself.

• Prune step: any (k-1)-itemset that is not frequent
cannot be a subset of a frequent k-itemset.

• Pseudo-code: C_k is the candidate itemset of size k;
L_k is the frequent itemset of size k.

Input: alarm queue (S_ij, W_k)

Output: frequent alarm sequence set F_ALARM_m

compute C_1 := { a | a ∈ F_ALARM_1 };
m := 1;
while C_m ≠ ∅ do
begin
    for all a ∈ C_m, search alarm queue S_ij
        to find support(a, W_k);   /* Algorithm 2 */
    obtain F_ALARM_m = { a ∈ C_m | support(a, W_k) >
        min_support };
    generate candidate C_{m+1} from F_ALARM_m;   /* Algorithm 3 */
    m := m + 1;
end.
for all m, output F_ALARM_m;

L_1 = {frequent items};
for (k = 1; L_k != ∅; k++) do begin
    C_{k+1} = candidates generated from L_k;
    for each transaction t in database do
        increment the count of all candidates in C_{k+1}
        that are contained in t
    L_{k+1} = candidates in C_{k+1} with min_support
end
return the union of all L_k as the resultant rules.

Figure 4: DataAprori algorithm
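
The support search over the alarm queue is not spelled out in the figure, so the sketch below shows only one plausible interpretation: count the alarms that start an in-order occurrence of the alarm type sequence within a time window. The queue layout (a time-ordered list of `(timestamp, alarm_type)` pairs), the window semantics and the name `window_support` are assumptions, not the paper's Algorithm 2.

```python
def window_support(sequence, alarm_queue, window):
    """Count the positions in `alarm_queue` where `sequence` starts and is
    completed, in order, within `window` time units of its first alarm.

    `alarm_queue` is assumed to be sorted by timestamp.
    """
    count = 0
    for i, (start_time, alarm_type) in enumerate(alarm_queue):
        if alarm_type != sequence[0]:
            continue
        matched = 1  # number of sequence elements found so far
        for time, later_type in alarm_queue[i + 1:]:
            if time - start_time > window:
                break
            if matched < len(sequence) and later_type == sequence[matched]:
                matched += 1
        if matched == len(sequence):
            count += 1
    return count
```

A candidate sequence would then be kept as frequent when this count exceeds `min_support`, mirroring the `support(a, W_k) > min_support` test in Figure 4.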

The alarm correlation algorithm (Algorithm 1) is composed
of two main steps. In the first step, according to the
minimum support (min_support), it searches the alarm
queues for frequent alarm type sequences; the discovered
sequences constitute the set of frequent alarm type
sequences, denoted by F_ALARM_m. In the second step,
according to the confidence of the correlation rule, it
generates the alarm correlation rules from F_ALARM_m.
The measure of an association rule S → T is defined as
confidence(S → T) = Support(ST) / Support(S), where S
and T are sets of attributes and S and T are disjoint.

The support and confidence of an association rule S → T
are defined as Support = P[ST] and Confidence =
P[ST]/P[S]. The confidence is the conditional
probability of T given S. If S and T are independent,
then Confidence = P[ST]/P[S] = P[T]. Therefore, if P[T]
is high, the confidence of the rule is high even though
S and T are unrelated, which makes the association rule
meaningless. To solve this problem, we use the
interestingness measure I = P(ST)/(P(S) × P(T)). The
interestingness measure is symmetric, because its value
for S → T is equal to its value for T → S. A rule holds
if and only if the confidence of the rule is greater
than min_conf.
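
A small numeric illustration of why confidence alone can mislead is given below. The probabilities are invented for the example, and the function names are assumptions of this sketch.

```python
def confidence(p_st, p_s):
    """Confidence of S -> T: P[ST] / P[S]."""
    return p_st / p_s


def interestingness(p_st, p_s, p_t):
    """Interestingness I = P(ST) / (P(S) * P(T)); equal to 1 when S and T
    are independent, and unchanged when S and T are swapped."""
    return p_st / (p_s * p_t)


# Invented example: S and T are independent, but T is very common.
p_s, p_t = 0.5, 0.9
p_st = p_s * p_t  # independence: P[ST] = P[S] * P[T]

conf_value = confidence(p_st, p_s)           # equals P[T]
interest_value = interestingness(p_st, p_s, p_t)
```

In this case the confidence is as high as P[T] even though S tells us nothing about T, while the interestingness measure stays at 1 and exposes the lack of correlation.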

Input: frequent alarm sequence set F_ALARM_m
Output: the correlation rules β → (α − β) with
confidence |P(α)/P(β) − P(α − β)|

for all α ∈ F_ALARM_m do   /* generate correlation rules */
    for all β ⊂ α do
        if |P(α)/P(β) − P(α − β)| > min_conf then
        begin
            generate the rule β → (α − β) with
            confidence |P(α)/P(β) − P(α − β)|;
        end
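
The rule generation step can be sketched in Python as follows, using the measure |P(α)/P(β) − P(α − β)| as it appears in the listing. The input format (a dict from `frozenset` alarm sets to their probabilities) and the name `generate_rules` are assumptions; alarm sets are treated as unordered here for simplicity.

```python
from itertools import combinations


def generate_rules(prob, min_conf):
    """Generate rules beta -> (alpha - beta) for every frequent set alpha
    and every non-empty proper subset beta of alpha.

    `prob` maps frozensets of alarm types to their probability (support).
    A rule is kept when |P(alpha)/P(beta) - P(alpha - beta)| > min_conf.
    """
    rules = []
    for alpha in prob:
        for size in range(1, len(alpha)):
            for beta in map(frozenset, combinations(sorted(alpha), size)):
                rest = alpha - beta
                if beta not in prob or rest not in prob:
                    continue  # probabilities needed for the measure are missing
                measure = abs(prob[alpha] / prob[beta] - prob[rest])
                if measure > min_conf:
                    rules.append((beta, rest, measure))
    return rules
```

Each returned triple holds the antecedent, the consequent and the value of the measure, so the rules can be sorted or reviewed by a network expert afterwards.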



V. Implementation Aspects and Experimental Results

DataAprori components are developed in the C and C++
programming languages as parts of a complex application.
The central application component is an executable file
that involves different dynamic-link libraries (DLLs) in
its architecture. Every part is implemented as a
separate DLL, which allows individual components to be
upgraded without disturbing the general application
structure.

For database access we have used Open Database
Connectivity (ODBC), with all data stored in MS SQL
Server. We have used standard MFC classes, but other
techniques could be used as well. The data in
experiment 1 are alarms in GSM networks, comprising 181
alarm types and 91311 alarm events. The time of alarm
events ranges from 1201-03-15-00 to 3001-03-79-52. In
Figure 5 each broken-line graph is denoted by win_xy,
where x represents the size of the additional alarm
window, Win_add, and y represents the size of the
frequent alarm window, Win_freq. The Y axis is the
number of alarm type sequences and the X axis is
Mini_support (using the minimum number of occurrences).

Figure 5: The number of frequent sequences changes in DataAprori.

Figure 6: Comparison between DataAprori and Apriori method.

From Figure 6, we can see that the reduction rate of our
method is a little better than that of the Apriori
method. However, the Apriori method is not able to
filter the dataset in real time. It can distinguish true
alerts from false ones. Our method has low time
consumption compared to the Apriori method. Moreover,
the Apriori method needs a lot of labeled data to build
its model and cannot filter alerts in the training
phase, while our method does not have these limits.
Using our method, security managers can therefore
respond to attacks more quickly. From the above
comparison, we believe that our system performs better
than current methods.




Figure 7 : Comparison of generated rules. 



VI. Conclusion 

Since the DataAprori algorithm can analyze alarm
correlation from an alarm database containing noisy
data, it generates more alarm sequences, and the number
of correlation rules therefore increases. Although the
correlation measure can reduce the number of rules,
people are still needed to select the most useful ones
from a large number of rules. Therefore, it is necessary
to study in the future how to extract more strongly
correlated rules from an alarm database containing
noise. This number can be reduced if we discover
frequently repeated alarm sequences and replace each of
them with one alarm. For that purpose, we have used the
Apriori algorithm, as discussed in our previous work.
However, after sequences are detected, it is necessary
to "judge" which sequences are relevant for the future
and which are not. One criterion can be the frequency
with which an alarm sequence appears.

Also, some sequences can be very relevant even if they
are not repeated very frequently. DataAprori can be used
for discovery and statistical processing of alarm
sequences, while the final decision should be made by a
human operator. According to our previous work and other
related work [12], the reduction rate at high-level
correlations can be rather high, up to 80%. Using the
test data sample and several alarm sequences confirmed
by network experts, the reduction rate was 15.41%.

VII. Future work 

Further research effort should be invested in the full
implementation of the proposed architecture, and in
improving and introducing new data mining techniques for
high-level correlation discovery as well as typical
patterns that can be used for low-level correlations and
filtrations. Fuzzy techniques can also be incorporated
into the proposed DataAprori in the future.